Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data

نویسندگان

Shah Atiqur Rahman

Yuxiao Huang

Jan Claassen

Nathaniel Heintzman

Samantha Kleinberg

چکیده

Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across time as well, but current methods have yet to incorporate these temporal relationships as well as multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and the Fourier transform. This enables imputation of missing values even when all data at a time point is missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring) the proposed method has the highest imputation accuracy. This was true for up to half the data being missing and when consecutive missing values are a significant fraction of the overall time series length.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fractional Regression Nearest Neighbor Imputation

Sample surveys typically gather information on a sample of units from a finite population and assign survey weights to the sampled units. Survey frequently have missing values for some variables for some units. Fractional regression imputation creates multiple values for each missing value by adding randomly selected empirical residuals to predicted values. Fractional imputation methods assign ...

متن کامل

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

متن کامل

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

متن کامل

KNN-DTW Based Missing Value Imputation for Microarray Time Series Data

Microarray technology provides an opportunity for scientists to analyze thousands of gene expression profiles simultaneously. However, microarray gene expression data often contain multiple missing expression values due to many reasons. Effective methods for missing value imputation in gene expression data are needed since many algorithms for gene analysis require a complete matrix of gene arra...

متن کامل

FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA

Clustering of objects is an important area of research and application in variety of fields. In this paper we present a good technique for data clustering and application of this Technique for data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Journal of biomedical informatics

دوره 58 شماره

صفحات -

تاریخ انتشار 2015

Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data

نویسندگان

چکیده

منابع مشابه

Fractional Regression Nearest Neighbor Imputation

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

KNN-DTW Based Missing Value Imputation for Microarray Time Series Data

FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA

عنوان ژورنال:

اشتراک گذاری